Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding objects, resorting instead to evaluating agents on unheard sound clips of known objects; meanwhile, previous SAVi methods do not include explicit mechanisms for incorporating domain knowledge about object and region semantics. These weaknesses limit the development and assessment of models' abilities to generalise their learned experience. In this work, we introduce the use of knowledge-driven scene priors in the semantic audio-visual embodied navigation task: we combine semantic information from our novel knowledge graph that encodes object-region relations, spatial knowledge from dual Graph Encoder Networks, and background knowledge from a series of pre-training tasks -- all within a reinforcement learning framework for audio-visual navigation. We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects. We show improvements over strong baselines in generalisation to unseen regions and novel sounding objects, within the Habitat-Matterport3D simulation environment, under the SoundSpaces task.
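To make the scene-prior idea concrete, here is a minimal sketch (not the authors' implementation; the node names, edges, and dimensions are illustrative assumptions) of encoding an object-region knowledge graph with a small graph convolutional encoder, whose node embeddings could then condition a navigation policy:

```python
# Minimal sketch, assuming a toy object-region knowledge graph and a
# two-layer graph convolutional encoder. All names and sizes are placeholders.
import torch
import torch.nn as nn

# Toy knowledge graph: objects are linked to the regions they typically occur in.
nodes = ["tv", "sofa", "sink", "living_room", "kitchen"]
edges = [("tv", "living_room"), ("sofa", "living_room"), ("sink", "kitchen")]
idx = {n: i for i, n in enumerate(nodes)}

n = len(nodes)
adj = torch.eye(n)                                   # self-loops
for u, v in edges:                                   # symmetric object-region relations
    adj[idx[u], idx[v]] = adj[idx[v], idx[u]] = 1.0
deg_inv_sqrt = adj.sum(-1).pow(-0.5)
adj_norm = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]   # D^-1/2 (A+I) D^-1/2

class GraphEncoder(nn.Module):
    """Two-layer GCN producing node embeddings a policy could attend over."""
    def __init__(self, in_dim=16, hid_dim=32, out_dim=32):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x, a):
        h = torch.relu(a @ self.w1(x))
        return a @ self.w2(h)

x = torch.randn(n, 16)                               # placeholder node features (e.g., word embeddings)
embeddings = GraphEncoder()(x, adj_norm)
print(embeddings.shape)                              # torch.Size([5, 32])
```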
Humans make use of multiple sensory modalities when interacting with objects and discovering their intrinsic properties. The visual modality alone is insufficient for deriving intuitions about object properties (e.g., which of two boxes is heavier), so non-visual modalities, such as touch and audio, must also be considered. Although robots can leverage various modalities to gain an understanding of object properties through learned exploratory interactions with objects (e.g., grasping, lifting, and shaking behaviours), a challenge remains: the implicit knowledge acquired by one robot through object exploration cannot be used directly by another robot with a different morphology, because the sensor models, observed data distributions, and interaction capabilities differ across these robot configurations. To avoid the costly process of learning interactive object-perception tasks from scratch for each new robot, we propose a multi-stage projection framework for transferring implicit knowledge of object properties across heterogeneous robot morphologies. We evaluate our method on object-property recognition and object-identity recognition tasks, using a dataset containing two heterogeneous robots that perform 7,600 object interactions. The results show that knowledge can be transferred across robots, so that a newly deployed robot can bootstrap its recognition models without exhaustively exploring all objects. We also propose a data augmentation technique and show that it improves model generalisation. We release our code and dataset here: https://github.com/gtatiya/implitic-knowledge-transfer.
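As a rough illustration of cross-morphology transfer (a minimal sketch, not the authors' multi-stage pipeline; the feature dimensions, paired data, and training loop are assumptions), a learned projection can map one robot's interaction features into another robot's feature space so that recognition models can be bootstrapped without re-exploring every object:

```python
# Minimal sketch, assuming paired multisensory features from a handful of
# objects that both robots have already interacted with.
import torch
import torch.nn as nn

SRC_DIM, TGT_DIM = 128, 96                 # per-robot feature sizes (assumed)

projector = nn.Sequential(                 # maps source-robot features -> target space
    nn.Linear(SRC_DIM, 256), nn.ReLU(),
    nn.Linear(256, TGT_DIM),
)
opt = torch.optim.Adam(projector.parameters(), lr=1e-3)

src_feats = torch.randn(500, SRC_DIM)      # placeholder for real interaction data
tgt_feats = torch.randn(500, TGT_DIM)

for _ in range(200):                       # fit the cross-morphology projection
    opt.zero_grad()
    loss = nn.functional.mse_loss(projector(src_feats), tgt_feats)
    loss.backward()
    opt.step()

# Features from an object explored only by the source robot can now yield
# pseudo-features in the target robot's space, e.g. to pre-train its classifier.
pseudo_target_feature = projector(torch.randn(1, SRC_DIM))
```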
We propose Rapid-Learn (Learning to Recover and Plan Again), a hybrid planning and learning method for tackling the problem of adapting to sudden and unexpected changes (i.e., novelties) in an agent's environment. Rapid-Learn is designed to formulate and solve a task's Markov Decision Processes (MDPs) in real time, and can exploit domain knowledge to learn any new dynamics caused by the environmental changes. It uses this domain knowledge to learn action executors, which can further be used to resolve execution impasses, leading to successful plan execution. The novelty information is reflected in its updated domain model. We demonstrate its efficacy by introducing a wide variety of novelties into a Minecraft-inspired environment, and compare our algorithm with transfer-learning baselines from the literature. Our method is (1) effective even in the presence of multiple novelties, (2) more sample-efficient than the transfer-learning RL baselines, and (3) robust to incomplete model information, in contrast to pure symbolic planning approaches.
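To make the plan-execute-learn loop concrete, the following is a minimal sketch (not the Rapid-Learn implementation; the environment, planner, and learner are toy stand-ins) of detecting an execution impasse, learning a new action executor, and updating the domain model before retrying:

```python
# Minimal sketch of a hybrid planning-and-learning loop, with stubbed-out
# planner, learner, and environment. All names below are hypothetical.
from dataclasses import dataclass, field

@dataclass
class DomainModel:
    """Symbolic domain model; operators map action names to executors."""
    operators: dict = field(default_factory=dict)

class ToyEnv:
    """Toy environment in which 'break_tree' fails once due to a novelty."""
    def __init__(self):
        self.patched = set()
    def step(self, action):
        return not (action == "break_tree" and action not in self.patched)

def plan(domain, goal):
    """Stand-in for a symbolic planner: returns a fixed action sequence."""
    return ["approach_tree", "break_tree", "collect_wood"]

def learn_executor(action, env, domain):
    """Stand-in for the RL stage: 'learn' a policy for the failed action and
    record the newly discovered dynamics in the domain model."""
    domain.operators[action] = f"learned_policy_for_{action}"
    env.patched.add(action)            # the learned executor now handles the novelty

def run(env, goal, domain):
    for action in plan(domain, goal):
        if not env.step(action):       # execution impasse caused by a novelty
            learn_executor(action, env, domain)
            assert env.step(action)    # retry with the learned executor

run(ToyEnv(), "have_wood", DomainModel())
```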